64 research outputs found

    Phonological Change in Cincinnati

    The Reversal of a Sound Change in Cincinnati

    MICK: A Meta-Learning Framework for Few-shot Relation Classification with Small Training Data

    Few-shot relation classification seeks to classify incoming query instances after seeing only a few support instances. This ability is usually gained by training on a large amount of in-domain annotated data. In this paper, we tackle an even harder problem by further limiting the amount of data available at training time. We propose a few-shot learning framework for relation classification that is particularly powerful when the training data is very small. In this framework, models not only strive to classify query instances but also seek underlying knowledge about the support instances in order to obtain better instance representations. The framework also includes a method for aggregating cross-domain knowledge into models through open-source task enrichment. Additionally, we construct a brand-new dataset, TinyRel-CM, a few-shot relation classification dataset in the health domain with purposely small training data and challenging relation classes. Experimental results demonstrate that our framework brings performance gains for most underlying classification models, outperforms state-of-the-art results given small training data, and achieves competitive results with sufficiently large training data.
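    The few-shot setup described above can be illustrated with a minimal sketch. This is not the MICK architecture itself; it is a generic N-way K-shot episode using nearest-prototype classification, assuming instances have already been embedded by some encoder (the encoder, labels, and dimensions here are hypothetical).

    ```python
    import numpy as np

    def classify_query(support, query):
        """N-way K-shot classification by nearest class prototype.

        support: dict mapping relation label -> (K, d) array of embedded
                 support instances (the encoder producing the embeddings
                 is assumed, not specified here).
        query:   (d,) embedded query instance.
        Returns the predicted relation label.
        """
        best_label, best_dist = None, float("inf")
        for label, vectors in support.items():
            prototype = vectors.mean(axis=0)          # class centroid over K shots
            dist = np.linalg.norm(query - prototype)  # Euclidean distance to centroid
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label

    # Toy 5-way 2-shot episode in a 3-dimensional embedding space:
    # each hypothetical relation rel_i is clustered around the value i.
    rng = np.random.default_rng(0)
    support = {f"rel_{i}": rng.normal(i, 0.1, size=(2, 3)) for i in range(5)}
    query = rng.normal(3, 0.1, size=3)  # drawn near the rel_3 cluster
    print(classify_query(support, query))
    ```

    With only K support instances per relation available at test time, the quality of the instance representations dominates accuracy, which is why the framework's focus on better support-instance representations matters.
    
    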

    Preface

    The University of Pennsylvania Working Papers in Linguistics (PWPL) is an occasional series published by the Penn Linguistics Club, the graduate student organization of the Linguistics Department of the University of Pennsylvania. The series has included volumes of previously unpublished work, or work in progress, by linguists with an ongoing affiliation with the Department, as well as volumes of papers from the NWAV conference and the Penn Linguistics Colloquium. We thank the Graduate Students Association Council of the University of Pennsylvania for financial support. This volume is the result of the combined efforts of many people. Papers were selected and reviewed for content under the direction of the issue editors. Atissa Banuazizi did most of the legwork for collecting the papers, and the PWPL editors carried out the production of the actual volume. Special thanks are due to Hikyoung Lee for her production help, expert proofreading, and amazing post-its. All remaining errors are the responsibility of the series editors or the authors, as the case may be.

    Parallel Aligned Treebank Corpora at LDC: Methodology, Annotation and Integration

    Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora AEPC 2010. Editors: Lars Ahrenberg, Jörg Tiedemann and Martin Volk. NEALT Proceedings Series, Vol. 10 (2010), 14-23. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15893

    Linguistic Resources for Effective, Affordable, Reusable Speech-to-Text

    This paper describes ongoing efforts at the Linguistic Data Consortium to create shared evaluation resources for improved speech-to-text technology. The DARPA EARS Program (Effective, Affordable, Reusable Speech-to-Text) is focused on enabling core STT technology to produce rich, highly accurate output in a range of languages and speaking styles. The aggressive EARS program goals motivate new approaches to corpus creation and distribution. EARS research sites require multilingual broadcast news and telephone speech, transcripts, and annotations at a much higher volume than for any previous technology program. In response to these demands, LDC has developed new corpora for training and evaluating speech-to-text systems in English, Arabic, and Chinese, and to support systems that distinguish speakers, identify and repair disfluencies, and punctuate text to improve readability.